Raptor: Integrating Checkpoints and Thread Migration for Cluster Management
نویسندگان
چکیده
Software distributed shared-memory (SDSM) provides the abstraction necessary to run shared-memory applications on cost-effective parallel platforms such as clusters of workstations. However, problems such as cluster component reliability and cluster management, which are not directly related to performance, need to be addressed before SDSM solutions can be widely adopted. This paper presents Raptor, a SDSM cluster management system based on checkpoint/recovery and thread migration. Raptor decouples the runtime system and application data from application threads, allowing efficient load balancing, resource allocation, and rollback recovery. There are two important features of the system. First, it reduces checkpoint overhead by only saving application-specific data that cannot be recreated at recovery time. Second, by integrating thread migration capability both at runtime or recovery, it allows the addition or removal of computing resources from a running application while adding little or no additional burden on the SDSM application programmer.
منابع مشابه
Efficient User-Level Thread Migration and Checkpointing on Windows NT Clusters
ion of running on a single shared memory multiprocessor, Brazos supports message passing by implementing the MPI library [20]. Thread migration in the context of a distributed system involves the movement of a computation thread from one currently executing process to another running process. Thread migration has been previously proposed as a tool for load-balancing and communication reduction ...
متن کاملEfficient User-Level Thread Migration and Checkpointing on Win
ion of running on a single shared memory multiprocessor, Brazos supports message passing by implementing the MPI library [20]. Thread migration in the context of a distributed system involves the movement of a computation thread from one currently executing process to another running process. Thread migration has been previously proposed as a tool for load-balancing and communication reduction ...
متن کاملEfficient Fine-Grain Thread Migration with Active Threads
Thread migration is established as a mechanism for achieving dynamic load sharing. However, fine-grained migration has not been used due to the high thread and messaging overheads. This paper describes a fine-grained thread migration system whose extensible event mechanism permits an efficient interface between threads and communications without compromising the modularity and performance of ei...
متن کاملLightweight Transparent Java Thread Migration for Distributed JVM
A distributed JVM on a cluster can provide a highperformance platform for running multi-threaded Java applications transparently. Efficient scheduling of Java threads among cluster nodes in a distributed JVM is desired for maintaining a balanced system workload so that the application can achieve maximum speedup. We present a transparent thread migration system that is able to support high-perf...
متن کاملNew Approach for Customer Clustering by Integrating the LRFM Model and Fuzzy Inference System
This study aimed at providing a systematic method to analyze the characteristics of customers’ purchasing behavior in order to improve the performance of customer relationship management system. For this purpose, the improved model of LRFM (including Length, Recency, Frequency, and Monetary indices) was utilized which is now a more common model than the basic RFM model apt for analyzing the cus...
متن کامل